Take_Home Exercise 1: Geospatial Analytics for Social Good: Application of Spatial and Spatio-temporal Point Patterns Analysis to discover the geographical distribution of Armed Conflict in Myanmar

Author

Jiale SO

Published

August 22, 2024

Modified

September 15, 2024

dxIsk4EIsia10WN3u9Vj


#Rural Vs Urban Divide
#Population?
#State level
#Division Level
#Region?
#Ethinicity?
#Government? 
#Type of Acceident
#Death rate? 
#conflict rate?
#Using appropriate function of sf and tidyverse packages, import and transform the downloaded armed conflict data and administrative boundary data into sf tibble data.frames.
# Using the geospatial data sets prepared, derive quarterly KDE layers.
# Using the geospatial data sets prepared, perform 2nd-Order Spatial Point Patterns Analysis.
# Using the geospatial data sets prepared, derive quarterly spatio-temporal KDE layers.
# Using the geospatial data sets prepared, perform 2nd-Order Spatio-temporal Point Patterns Analysis.
# Using appropriate tmap functions, display the KDE and Spatio-temporal KDE layers on openstreetmap of Myanmar.
# Describe the spatial patterns revealed by the KDE and Spatio-temporal KDE maps.

1.0 Introduction

The study of armed conflicts in Myanmar has gained critical importance in understanding the geographical distribution and intensity of violence across different regions. Myanmar’s complex ethnic composition and ongoing civil strife make it a unique case for geospatial analysis. This project aims to apply spatial and spatio-temporal point pattern analysis methods to uncover the patterns of armed conflict between January 2021 and June 2024.

By leveraging conflict data from the Armed Conflict Location & Event Data (ACLED) and geospatial tools, we will focus on visualizing and interpreting conflict density through heat maps, Kernel Density Estimation (KDE), and advanced spatio-temporal analysis.

Our analysis will focus on four types of conflict events:

  1. Battles,
  2. Explosion/Remote violence,
  3. Strategic developments,
  4. Violence against civilians,

with particular attention paid to quarterly patterns in conflict occurrence. This study not only seeks to uncover spatial clusters of conflict but also to highlight areas of humanitarian concern.

2.0 Setting Up The Environment & Dataset

2.1 Installing the required Packages

Key Packages Used in the Project:

  1. sf: Handles simple features in R, allowing for spatial data manipulation and analysis. It is crucial for reading and managing geospatial data like shapefiles (e.g., Myanmar’s administrative boundaries).

  2. raster: Used for raster-based spatial data manipulation, especially for working with raster maps, such as Kernel Density Estimation (KDE) results.

  3. spatstat: A powerful package for spatial point pattern analysis. It helps to analyze and visualize spatial point data, particularly for identifying clusters or patterns in armed conflict events.

  4. sparr: Builds on spatstat and focuses on performing spatial and spatio-temporal kernel smoothing, which will be crucial for KDE and heatmap creation.

  5. tmap: A thematic mapping package that will allow us to create maps, including KDE visualizations on an OpenStreetMap base.

  6. tidyverse: A collection of data manipulation packages like dplyr, ggplot2, and purrr. It’s essential for data cleaning, manipulation, and visualization tasks.

  7. stpp: Used for spatio-temporal point pattern analysis, crucial for analyzing how conflict events evolve in both space and time.

  8. skimr: A quick and comprehensive tool to provide summaries and descriptive statistics for datasets, helping in the initial exploration.

  9. gganimate: Extends ggplot2 to create animated visualizations. We can use this for animated time-series or evolving conflict maps.

  10. ggplot2: The core plotting package in R, essential for creating visualizations like time series plots and KDE heatmaps.

  11. plotly: Useful for creating interactive visualizations, allowing users to explore spatial data interactively (e.g., hover over points to see conflict details).

  12. pacman: is a package management tool in R designed to streamline the process of loading and installing packages.

pacman::p_load(sf, raster, spatstat, sparr, tmap, tidyverse, stpp, skimr, gganimate, ggplot2, plotly, flexdashboard, shiny)

2.2 Data-set involved in this topic

For this analysis, we use two key datasets:

2.2.1 ACLED Armed Conflict Data

Location & Event Data (ACLED)platform, which maintains an extensive record of conflict events globally. For this specific analysis, we will limit the dataset by filtering based on the following parameters to streamline data preparation and minimize the need for extensive data cleaning:

Data Parameter Filter Category
Date Range From 01/01/2021 to 30/06/2024.
Event Type 1. Battles
2. Violence Against Civilians
3. Explosions/Remote Violence
4. Strategic Developments
Country Myanmar
ACLED Configuration Image

Code to Import ACLED Dataset
ACLEDData <- read_csv("data/raw/aspatial/2021-01-01-2024-06-30-Myanmar.csv")
Rows: 42608 Columns: 35
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (20): event_id_cnty, event_date, disorder_type, event_type, sub_event_ty...
dbl (15): year, time_precision, inter1, inter2, interaction, iso, latitude, ...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2.2.1.1 Understanding the data set fields.

Referencing this ACLED Official codebook, this is the dataset that we are working with, not to bore you with the details are mainly interested in the following fields,

  • Event ID: Unique identifier for each conflict event.

  • Event Date: Date of occurrence.

  • Event Type: Type of conflict event (e.g., Battles, Remote Violence).

  • Latitude/Longitude: Coordinates of the event.

  • Fatalities: Number of fatalities resulting from the event.

  • Actors: The groups or individuals involved in the conflict (e.g., state actors, ethnic armed groups).

  • Admin Levels: Administrative region, district, and township where the event took place.

If you’re interested in the data set fields to explore more, here’s the full fields!

ACLED Full Table Fields Summary
Fields name Fields Description Values
event_id_cnty A unique alphanumeric event identifier by number and country acronym. This identifier remains constant even when the event details are updated. E.g. ETH9766
event_date The date on which the event took place. Recorded as Year-Month-Day. E.g. 2023-02-16
year The year in which the event took place. E.g. 2018
time_precision A numeric code between 1 and 3 indicating the level of precision of the date recorded for the event. The higher the number, the lower the precision. 1, 2, or 3; with 1 being the most precise.
disorder_type The disorder category an event belongs to. Political violence, Demonstrations, or Strategic developments.
event_type The type of event; further specifies the nature of the event. E.g. BattlesFor the full list of ACLED event types, see the ACLED Event Types table.
sub_event_type A subcategory of the event type. E.g. Armed clashFor the full list of ACLED sub-event types, see the ACLED Event Types table.
actor1 One of two main actors involved in the event (does not necessarily indicate the aggressor). E.g. Rioters (Papua New Guinea)
assoc_actor_1 Actor(s) involved in the event alongside ‘Actor 1’ or actor designations that further identify ‘Actor 1’. E.g. Labor Group (Spain); Women (Spain)Can have multiple actors separated by a semicolon, or can be blank.
inter1 A numeric code between 0 and 8 indicating the type of ‘Actor 1’ (for more, see the section Actor Names, Types, and ‘Inter’ Codes). 1, 2, 3, 4, 5, 6, 7, or 8.
actor2 One of two main actors involved in the event (does not necessarily indicate the target or victim). E.g. Civilians (Kenya)Can be blank.
assoc_actor_2 Actor(s) involved in the event alongside ‘Actor 2’ or actor designation further identifying ‘Actor 2’. E.g. Labor Group (Spain); Women (Spain)Can have multiple actors separated by a semicolon, or can be blank.
inter2 A numeric code between 0 to 8 indicating the type of ‘Actor 2’ (for more, see the section Actor Names, Types, and ‘Inter’ Codes). 0, 1, 2, 3, 4, 5, 6, 7, or 8.
interaction A two-digit numeric code (combination of ‘Inter 1’ and ‘Inter 2’) indicating the two actor types interacting in the event (for more, see the section Actor Names, Types, and ‘Inter’ Codes). E.g.3, 58
civilian_targeting This column indicates whether the event involved civilian targeting. Either ‘Civilians targeted’ or blank.
iso A unique three-digit numeric code assigned to each country or territory according to ISO 3166. E.g. 231 for Ethiopia
region The region of the world where the event took place. E.g. Eastern Africa
country The country or territory in which the event took place. E.g. Ethiopia
admin1 The largest sub-national administrative region in which the event took place. E.g. Oromia
admin2 The second largest sub-national administrative region in which the event took place. E.g. ArsiCan be blank.
admin3 The third largest sub-national administrative region in which the event took place. E.g. MertiCan be blank.
location The name of the location at which the event took place. E.g. Abomsa
latitude The latitude of the location in four decimal degrees notation (EPSG:4326). E.g. 8.5907
longitude The longitude of the location in four decimal degrees notation (EPSG:4326). E.g. 39.8588
geo_precision A numeric code between 1 and 3 indicating the level of certainty of the location recorded for the event. The higher the number, the lower the precision. 1, 2, or 3; with 1 being the most precise.
source The sources used to record the event. Separated by a semicolon. E.g. Ansar Allah; Yemen Data Project
source_ scale An indication of the geographic closeness of the used sources to the event (for more, see the section Source Scale). E.g. Local partner-National
notes A short description of the event. E.g. On 16 February 2023, OLF-Shane abducted an unidentified number of civilians after stopping a vehicle in an area near Abomsa (Merti, Arsi, Oromia). The abductees were traveling from Adama to Abomsa, Arsi.
fatalities The number of reported fatalities arising from an event. When there are conflicting reports, the most conservative estimate is recorded. E.g. 3No information on fatalities is recorded as 0 reported fatalities.
tags Additional structured information about the event. Separated by a semicolon. E.g. women targeted: politicians; sexual violence
timestamp An automatically generated Unix timestamp that represents the exact date and time an event was uploaded to the ACLED API. E.g. 1676909320

2.2.2 Myanmar Administrative Boundaries (Shapefiles):

Obtained through Geonode Mimu, this shapefile helps us to build the map and set the boundary zone of each district of myanmar. This dataset provides the geographical boundaries of Myanmar’s administrative divisions, from the national level down to the township level. It is essential for mapping conflict events to specific regions.

Note

Myanmar has State, District and Township level, why District Level?

Choosing the district level over the township level for conflict analysis provides a better balance between detail and clarity. The district level allows us to capture broader regional trends without overwhelming the analysis with too many granular data points, as township-level data can be overly detailed. It also improves computational efficiency and makes visualizations clearer, while still offering enough specificity to reveal conflict hotspots. Additionally, population and auxiliary data are more readily available at the district level, making the analysis more consistent and manageable.

Code to Import Shapefile Dataset
m_sf <- st_read(dsn="data/raw/geospatial", layer = "mmr_polbnda_adm2_250k_mimu") 
Reading layer `mmr_polbnda_adm2_250k_mimu' from data source 
  `C:\Users\jiale\Desktop\IS415\IS415-GAA\Take_Home_Exercises\Take_Home_Exercise_1\data\raw\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 80 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 92.1721 ymin: 9.696844 xmax: 101.17 ymax: 28.54554
Geodetic CRS:  WGS 84

2.2.2.1 Understanding the data set fields.

Field Name Description
OBJECTID This is a unique identifier for each feature in the dataset, typically used to identify individual records or polygons in the shapefile.
ST This represents the State or Region in Myanmar. For example, in your dataset, “Ayeyarwady” refers to a state/region.
ST_PCODE This stands for State Postal Code. It is a standardized code that represents each state or region in Myanmar, such as “MMR017” for Ayeyarwady.
DT This stands for District or Township within the respective state/region. For example, “Hinthada” is a district or township within Ayeyarwady.
DT_PCODE This stands for District/Township Postal Code. It is a standardized postal code for each district or township, such as “MMR017D002” for the Hinthada district/township in Ayeyarwady.
DT_MMR This field could be the District/Township name in Myanmar script, written in the local language. It may be an alternative representation of the “DT” field, showing the name of the district/township in Myanmar’s native language.
PCODE_V This could be a Version of the Postal Code or a verification value used internally in the dataset. In this case, the value is “9.4”, possibly indicating a specific version of postal codes or an accuracy measure.
geometry This column represents the spatial data for each district/township. It contains the geometrical shape (MULTIPOLYGON) defining the boundaries of the state or district/township, with coordinates provided in longitude and latitude.

3.0 Data Pre-processing

To ensure accuracy and usability of the data, several preprocessing steps will be undertaken for the different datasets.

3.1 Myanmar Shapefile

3.1.1 Setting the CRS for the

Since Myanmar uses CRS of 4326 and when we download the map it’s in WGS84, we should change it to 4326 .

st_crs(m_sf)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
# Set the CRS for m_sf, assuming the appropriate CRS is WGS 84 (EPSG:4326)
m_sf <- st_set_crs(m_sf, 4326)
# Verify that the CRS has been correctly set
print(st_crs(m_sf))
Coordinate Reference System:
  User input: EPSG:4326 
  wkt:
GEOGCRS["WGS 84",
    ENSEMBLE["World Geodetic System 1984 ensemble",
        MEMBER["World Geodetic System 1984 (Transit)"],
        MEMBER["World Geodetic System 1984 (G730)"],
        MEMBER["World Geodetic System 1984 (G873)"],
        MEMBER["World Geodetic System 1984 (G1150)"],
        MEMBER["World Geodetic System 1984 (G1674)"],
        MEMBER["World Geodetic System 1984 (G1762)"],
        MEMBER["World Geodetic System 1984 (G2139)"],
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]],
        ENSEMBLEACCURACY[2.0]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    USAGE[
        SCOPE["Horizontal component of 3D system."],
        AREA["World."],
        BBOX[-90,-180,90,180]],
    ID["EPSG",4326]]

3.1.2 Renaming and removal of column names and

colnames(m_sf) <- c("OBJECTID", "state", "state_code", "district", "district_code", "district_mmr", "mimi_version", "geometry")
m_sf <- m_sf %>% select(state, district, district_mmr, geometry)

3.1.3 Checking for validity of data

When working with spatial data, it’s crucial to ensure that all geometries are valid. Invalid geometries can cause errors in analysis and visualization.

  1. Checking Validity with st_is_valid():

  2. Identifying Invalid Geometries:

  3. Fixing Invalid Geometries with st_make_valid()

#checking if it's valid
m_sf_validity <- st_is_valid(m_sf)
m_sf_invalid <- which(!m_sf_validity)
if (length(m_sf_invalid) > 0) {
  print("MPZ Invalid!")
  print(m_sf[m_sf_invalid, ])
} else {
  print("it's valid!")
}
[1] "it's valid!"

3.1.4 Visualizing the mynamar map

On the top right, you can toggle between the district level and also the state level to understand more about the boundaries of Myanmar.

tmap_mode("view")  # Enable interactive mode for tmap
tmap mode set to interactive viewing
# Step 3: Create the base map colored by district, and add markers with size based on fatalities
tm_shape(m_sf) +  # Base map (Myanmar boundaries)
  tm_polygons("district",  # Color the base map by district
              palette = "Set3",  # Use Set3 color palette for districts
              border.col = "gray", 
              alpha = 0.5,  # Semi-transparent polygons
              title = "Districts of Myanmar") + 
  
  # Adjust the legend layout to control size
  tm_layout(
    legend.outside = TRUE,  # Place the legend outside the map
    legend.outside.position = "right",  # Position the legend on the right side
    legend.height = 0.3,  # Control legend height
    legend.width = 0.2,   # Control legend width
    legend.text.size = 0.8,  # Adjust the text size in the legend
    legend.title.size = 1    # Adjust the title size in the legend
  )
Warning in pre_process_gt(x, interactive = interactive, orig_crs =
gm$shape.orig_crs): legend.width controls the width of the legend within a map.
Please use legend.outside.size to control the width of the outside legend
Warning: Number of levels of the variable "district" is 80, which is larger
than max.categories (which is 30), so levels are combined. Set
tmap_options(max.categories = 80) in the layer function to show all levels.

3.2 ACLED Data

3.2.1 Changing the Column Names

Since Myanmar’s regional hierarchy follows State, District, and Township levels, we will rename the columns accordingly:

  • admin1State

  • admin2District

  • admin3Township

This is important because different countries use different administrative hierarchies. For example, in Singapore, the hierarchy is organized by Region and Subzones. Adjusting these names ensures that our dataset aligns with Myanmar’s specific regional structure for accurate analysis.

ACLEDDataCleanse <- ACLEDData
colnames(ACLEDDataCleanse)[which(names(ACLEDDataCleanse) == "admin1")] <- "state"
colnames(ACLEDDataCleanse)[which(names(ACLEDDataCleanse) == "admin2")] <- "district"
colnames(ACLEDDataCleanse)[which(names(ACLEDDataCleanse) == "admin3")] <- "township"

3.2.2 Adding a “Quarter-Year” Column

To facilitate our temporal analysis, we need to add a “quarter-year” column based on the event_date field. This can be done by adjusting the date format to represent the quarter and year, ensuring that each event is categorized by the specific quarter it occurred in (e.g., Q1-2021, Q2-2022). This will allow for easier analysis of conflict trends over time.

# Convert event_date to Date format (if it's not already a date)
ACLEDDataCleanse$event_date <- as.Date(ACLEDDataCleanse$event_date, format="%d-%b-%y")  # Adjust the format if needed
# Add a new column that shows the quarter and year
ACLEDDataCleanse <- ACLEDDataCleanse %>%
  mutate(quarter_year = paste0("Q", quarter(event_date), "-", year(event_date)))

head(ACLEDDataCleanse)
# A tibble: 6 × 36
  event_id_cnty event_date  year time_precision disorder_type         event_type
  <chr>         <date>     <dbl>          <dbl> <chr>                 <chr>     
1 MMR64313      2024-06-30  2024              1 Political violence    Battles   
2 MMR64320      2024-06-30  2024              1 Political violence    Battles   
3 MMR64321      2024-06-30  2024              1 Political violence    Battles   
4 MMR64322      2024-06-30  2024              1 Strategic developmen… Strategic…
5 MMR64323      2024-06-30  2024              1 Political violence    Battles   
6 MMR64324      2024-06-30  2024              1 Strategic developmen… Strategic…
# ℹ 30 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
#   inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
#   interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
#   country <chr>, state <chr>, district <chr>, township <chr>, location <chr>,
#   latitude <dbl>, longitude <dbl>, geo_precision <dbl>, source <chr>,
#   source_scale <chr>, notes <chr>, fatalities <dbl>, tags <chr>,
#   timestamp <dbl>, population_1km <dbl>, population_2km <dbl>, …

3.2.3 Joining ACLED’s Codebook Description

ACLED’s stores their data for the column “interaction” and “inter1” and “inter2” in codes, using their code book, let’s reorganise their data for simplier view, we can reference the code book here to know what each code represent. Map it out as a csv file read it in and change accordingly.

3.2.3.1 Left joining inter1 and inter’s description.

For more details about each inter code read here.

ACLEDActorInterCode <- read_csv("data/raw/aspatial/ActorTypesInterCode.csv")
Rows: 8 Columns: 2
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Description
dbl (1): code

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ACLEDDataCleanse <- ACLEDDataCleanse %>%
  left_join(ACLEDActorInterCode, by = c("inter1" = "code")) %>%
  rename(inter1_description = Description)
# Left join again for inter2
ACLEDDataCleanse <- ACLEDDataCleanse %>%
  left_join(ACLEDActorInterCode, by = c("inter2" = "code")) %>%
  rename(inter2_description = Description)

head(ACLEDDataCleanse)
# A tibble: 6 × 38
  event_id_cnty event_date  year time_precision disorder_type         event_type
  <chr>         <date>     <dbl>          <dbl> <chr>                 <chr>     
1 MMR64313      2024-06-30  2024              1 Political violence    Battles   
2 MMR64320      2024-06-30  2024              1 Political violence    Battles   
3 MMR64321      2024-06-30  2024              1 Political violence    Battles   
4 MMR64322      2024-06-30  2024              1 Strategic developmen… Strategic…
5 MMR64323      2024-06-30  2024              1 Political violence    Battles   
6 MMR64324      2024-06-30  2024              1 Strategic developmen… Strategic…
# ℹ 32 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
#   inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
#   interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
#   country <chr>, state <chr>, district <chr>, township <chr>, location <chr>,
#   latitude <dbl>, longitude <dbl>, geo_precision <dbl>, source <chr>,
#   source_scale <chr>, notes <chr>, fatalities <dbl>, tags <chr>,
#   timestamp <dbl>, population_1km <dbl>, population_2km <dbl>, …

3.2.3.2 Left joining interaction description.

For more details about each interaction code read here.

ACLEDInteractionCode <- read_csv("data/raw/aspatial/AcledInteractionCodes.csv")
Rows: 44 Columns: 2
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Description
dbl (1): code

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ACLEDDataCleanse <- ACLEDDataCleanse %>%
  left_join(ACLEDInteractionCode, by = c("interaction" = "code")) %>%
  rename(interaction_description = Description)

head(ACLEDDataCleanse)
# A tibble: 6 × 39
  event_id_cnty event_date  year time_precision disorder_type         event_type
  <chr>         <date>     <dbl>          <dbl> <chr>                 <chr>     
1 MMR64313      2024-06-30  2024              1 Political violence    Battles   
2 MMR64320      2024-06-30  2024              1 Political violence    Battles   
3 MMR64321      2024-06-30  2024              1 Political violence    Battles   
4 MMR64322      2024-06-30  2024              1 Strategic developmen… Strategic…
5 MMR64323      2024-06-30  2024              1 Political violence    Battles   
6 MMR64324      2024-06-30  2024              1 Strategic developmen… Strategic…
# ℹ 33 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
#   inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
#   interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
#   country <chr>, state <chr>, district <chr>, township <chr>, location <chr>,
#   latitude <dbl>, longitude <dbl>, geo_precision <dbl>, source <chr>,
#   source_scale <chr>, notes <chr>, fatalities <dbl>, tags <chr>,
#   timestamp <dbl>, population_1km <dbl>, population_2km <dbl>, …

3.2.3 Making it a SF Object and reverse geolocate state and district

Since ACLED provides longitude and latitude data, I prefer to reverse geolocate the points to match Myanmar’s official state and district boundaries. We are uncertain how ACLED assigns these regions, so to ensure consistency, we remove the original state and district columns from the ACLED data and replace them with the geolocated values.

Steps:

  1. Convert ACLED Data to an SF Object: Using longitude and latitude coordinates, transform the ACLED dataset into a spatial object. Remember taht we have to set CRS 4326 here as well.

  2. Perform a Spatial Join: Match the points from ACLED with the corresponding state and district boundaries from the m_sf shapefile, selecting only those columns.

  3. Remove Original Columns: After the spatial join, drop the original state, district, and township columns from the ACLED dataset.

  4. Rename the Joined Columns: Rename the newly joined state.y and district.y to state and district, effectively replacing the original columns with the reverse-geolocated values.

# Step 1: Convert ACLEDDataCleanse to an sf object using longitude and latitude
ACLEDDataCleanse_sf <- st_as_sf(ACLEDDataCleanse, coords = c("longitude", "latitude"), crs = 4326, remove = FALSE)

# Step 2: Perform a spatial join, selecting only the state and district from m_sf
reverse_geolocated <- st_join(ACLEDDataCleanse_sf, m_sf[, c("state", "district")], join = st_intersects)

# Step 3: Remove original 'state', 'district', and 'township' columns from ACLEDDataCleanse (if they exist)
ACLEDData_sf <- reverse_geolocated %>%
  select(-contains("state.x"), -contains("district.x"), -contains("township")) %>%  # Remove original state, district, township
  rename(state = state.y, district = district.y)  # Rename the newly joined columns

#Step 4 Set CRS back to EPSG: 4326
ACLEDData_sf <- st_set_crs(ACLEDData_sf, 4326)

# Step 5: View the cleaned ACLEDDataCleanse dataset
head(ACLEDData_sf)
Simple feature collection with 6 features and 38 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 94.9021 ymin: 21.9251 xmax: 96.2364 ymax: 22.88
Geodetic CRS:  WGS 84
# A tibble: 6 × 39
  event_id_cnty event_date  year time_precision disorder_type         event_type
  <chr>         <date>     <dbl>          <dbl> <chr>                 <chr>     
1 MMR64313      2024-06-30  2024              1 Political violence    Battles   
2 MMR64320      2024-06-30  2024              1 Political violence    Battles   
3 MMR64321      2024-06-30  2024              1 Political violence    Battles   
4 MMR64322      2024-06-30  2024              1 Strategic developmen… Strategic…
5 MMR64323      2024-06-30  2024              1 Political violence    Battles   
6 MMR64324      2024-06-30  2024              1 Strategic developmen… Strategic…
# ℹ 33 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
#   inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
#   interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
#   country <chr>, location <chr>, latitude <dbl>, longitude <dbl>,
#   geo_precision <dbl>, source <chr>, source_scale <chr>, notes <chr>,
#   fatalities <dbl>, tags <chr>, timestamp <dbl>, population_1km <dbl>,
#   population_2km <dbl>, population_5km <dbl>, population_best <dbl>, …
# Step 6: Check the class of the result
print(st_crs(ACLEDData_sf))
Coordinate Reference System:
  User input: EPSG:4326 
  wkt:
GEOGCRS["WGS 84",
    ENSEMBLE["World Geodetic System 1984 ensemble",
        MEMBER["World Geodetic System 1984 (Transit)"],
        MEMBER["World Geodetic System 1984 (G730)"],
        MEMBER["World Geodetic System 1984 (G873)"],
        MEMBER["World Geodetic System 1984 (G1150)"],
        MEMBER["World Geodetic System 1984 (G1674)"],
        MEMBER["World Geodetic System 1984 (G1762)"],
        MEMBER["World Geodetic System 1984 (G2139)"],
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]],
        ENSEMBLEACCURACY[2.0]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    USAGE[
        SCOPE["Horizontal component of 3D system."],
        AREA["World."],
        BBOX[-90,-180,90,180]],
    ID["EPSG",4326]]

3.2.4 Visualizing it by Event Type

# Step 1: Cleanse the ACLEDData_sf dataset (select necessary columns for markers)
ACLEDData_cleaned <- ACLEDData_sf %>%
  select(event_type, actor1, fatalities, geometry)  # Adjust columns to match your dataset

# Step 2: Create the tmap object
tmap_mode("view")  # Enable interactive mode for tmap
tmap mode set to interactive viewing
# Step 3: Create the base map colored by district, and add markers with size based on fatalities
tm_shape(m_sf) +  # Base map (Myanmar boundaries)
  tm_polygons("district",  # Color the base map by district
              palette = "Set3",  # Use Set3 color palette for districts
              border.col = "gray", 
              alpha = 0.5,  # Semi-transparent polygons
              title = "Districts of Myanmar") +  # Title for the district legend
  
  # Add event markers with size based on fatalities
  tm_shape(ACLEDData_cleaned) + 
  tm_bubbles(size = "fatalities",  # Marker size based on fatalities
             col = "event_type",    # Color markers by event_type
             popup.vars = c("event_type", "actor1", "fatalities"),  # Display these fields in the popup
             palette = "Set1",       # Use Set1 color palette for event types
             border.col = "black",   # Set border color for the bubbles
             border.alpha = 0.5,     # Semi-transparent border
             title.size = "Number of Fatalities",  # Title for bubble size legend
             title.col = "Event Types") +  # Title for event type legend
  
  # Customize the layout
  tm_layout(
    title = "Event Types and Fatalities in Myanmar",  
    title.position = c("center", "top"),
    legend.outside = TRUE,  # Place the legend outside for clarity
    legend.title.size = 1,  # Adjust the title size for the legend
    legend.text.size = 0.8  # Adjust the text size for the legend
  )
Warning: Number of levels of the variable "district" is 80, which is larger
than max.categories (which is 30), so levels are combined. Set
tmap_options(max.categories = 80) in the layer function to show all levels.
Legend for symbol sizes not available in view mode.

3.3 Aggregation of Data For Explanatory Purposes

# Fixing the expand.grid function to use consistent column names
all_combinations <- expand.grid(
  district = unique(m_sf$district),                # Unique districts from shapefile
  event_type = unique(ACLEDDataCleanse$event_type), # Unique event types
  quarter_year = unique(ACLEDDataCleanse$quarter_year), # Unique quarters
  interaction_description = unique(ACLEDDataCleanse$interaction_description),  # Unique interactions
  inter1_description = unique(ACLEDDataCleanse$inter1_description),  # Actor 1 descriptions
  inter2_description = unique(ACLEDDataCleanse$inter2_description)   # Actor 2 descriptions
)

# Aggregate the data by quarter_year, district, event_type, interaction_description, inter1_description, inter2_description
agg_data <- ACLEDDataCleanse %>%
  group_by(quarter_year, district, event_type, interaction_description, inter1_description, inter2_description) %>%
  summarise(count = n(),                       # Count the number of events
            fatalities = sum(fatalities, na.rm = TRUE),  # Sum the fatalities
            .groups = "drop")

# Perform a full join with all_combinations to keep all districts, event types, interactions, etc.
full_data <- all_combinations %>%
  left_join(agg_data, by = c("district", "event_type", "quarter_year", "interaction_description", "inter1_description", "inter2_description")) %>%
  mutate(count = ifelse(is.na(count), 0, count),        # Replace NAs in count with 0
         fatalities = ifelse(is.na(fatalities), 0, fatalities))  # Replace NAs in fatalities with 0

3.4 Summary of Data sets

We have created a total of 3 Data set,

  1. ACLEDData_SF
  2. M_SF ->
  3. Aggregated Level

4.0 Exploratory Data analysis

With the data cleansed let’s do a very touch down analysis on our data-set, to figure out what are some statistic of the analysis Below are some exploratory analysis on the conflicts of

4.1 Number of Conflicts and Across State, Township and Quarter, By Event Type.

Alternatively, use this dashboard to play around with the data set to see what you’re working with.

Number

5.0 Setting The Stage for Analysis

5.1 Derive Quarterly Kernel Density E Layers:

ACLEDDataCleanse
# A tibble: 42,608 × 39
   event_id_cnty event_date  year time_precision disorder_type        event_type
   <chr>         <date>     <dbl>          <dbl> <chr>                <chr>     
 1 MMR64313      2024-06-30  2024              1 Political violence   Battles   
 2 MMR64320      2024-06-30  2024              1 Political violence   Battles   
 3 MMR64321      2024-06-30  2024              1 Political violence   Battles   
 4 MMR64322      2024-06-30  2024              1 Strategic developme… Strategic…
 5 MMR64323      2024-06-30  2024              1 Political violence   Battles   
 6 MMR64324      2024-06-30  2024              1 Strategic developme… Strategic…
 7 MMR64325      2024-06-30  2024              1 Political violence   Battles   
 8 MMR64326      2024-06-30  2024              1 Political violence   Battles   
 9 MMR64328      2024-06-30  2024              1 Political violence   Battles   
10 MMR64330      2024-06-30  2024              1 Political violence   Battles   
# ℹ 42,598 more rows
# ℹ 33 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
#   inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
#   interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
#   country <chr>, state <chr>, district <chr>, township <chr>, location <chr>,
#   latitude <dbl>, longitude <dbl>, geo_precision <dbl>, source <chr>,
#   source_scale <chr>, notes <chr>, fatalities <dbl>, tags <chr>, …

5.2 Perform 2nd-Order Spatial Point Pattern Analysis:

5.3: Derive Quarterly Spatio-Temporal KDE Layers:

5.4: Perform 2nd-Order Spatio-Temporal Point Patterns Analysis:

5.5: Visualize with tmap on OpenStreetMap:
5.6:Describe Spatial Patterns:

# Sidebar filter optio

6.0 Advanced Geospatial Analysis

7.0 Results and Interpretation

8.0 Conclusion

9.0 References